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A general method is proposed which allows one to estimate drift and diffusion coefficients of a 
stochastic process governed by a Langevin equation, ft extends a previously devised approach [R. 
Friedrich et al., Physics Letters A 271, 2f7 (2000)], which requires sufficiently high sampling rates. 
The analysis is based on an iterative procedure minimizing the Kullback-Leibler distance between 
measured and estimated two time joint probability distributions of the process. 
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I. INTRODUCTION 

Complex behavior in systems far from equilibrium can 
quite often be traced back to rather simple laws due to 
the existence of processes of selforganization Since 
complex systems are composed of a huge number of sub- 
systems, however, fluctuations stemming from the micro- 
scopic degrees of freedom play an important role intro- 
ducing a temporal variation on a fast time scale which 
quite often can be considered as fluctuations. The con- 
sequence is the existence of evolution equations of a set 
of macroscopic order parameters qjfi) which are governed 
by nonlinear Langevin equations @ - Q : 



(1) 



where q(t) denotes the n-dimensional state vector, D 1 (q) 
is the drift vector and the matrix g(q) is related to the dif- 
fusion matrix according to (£> 2 (q)) .. = Y, k 9ik{^}9jk{^)- 
F(i) are fluctuating forces with Gaussian statistics delta- 
correlated in time: < Fi(t) >= 0, < Fi(t)F k (f) >= 
28ikS(t — t'). Here and in the following we adopt Ito's 
interpretation of stochastic integrals @ ■ • 

Analyzing complex systems, which can be described by 
stochastic equations of the form QJ, therefore, amounts 
to assess the underlying Langevin equations or the cor- 
responding Fokkcr-Planck equations from an inspection 
of experimentally determined time series Q. Recently, 
an operational method has been devised, which 
allows one to estimate drift and diffusion coefficients of 
the stochastic processes from experimental data. This 
method has been successfully applied to various prob- 
lems in the field of complex systems like the analysis of 
noisy electrical circuits 6] , stochastic dynamics of metal 
cutting pj , systems with feedback delay Q , meteorologi- 
cal processes like wind-driven Southern Ocean variability 
, traffic flow data an d physiological time series |12| . 
Furthermore it has been applied to problems like turbu- 
lent flows 0, [13, passive scalar advection finan- 
cial time series Hi , analysis of rough surfaces |17| , 0] , 



which can be characterized as a stochastic process with 
respect to a scale variable exhibiting markovian proper- 
ties in scale. 

The method is based on the evaluation of the time 
limits the first and second conditional moments, 



lim — < q(f + t) 

t^0 T 



• q(t)|q(t) = q > (2a) 



lim — < [q(t + t) - qft)]i 

T^0 IT 

[q(* + T)-q(t)] J |q(t) = q> 



(2b) 



From these expressions it becomes evident that the sam- 
pling rate in the experiments has to be sufficiently high 
in order to allow for a reliable evaluation of the limit 
t — > 0. Therefore, in all applications mentioned above 
the results have been checked in a selfconsistent manner 
by a recalculation of conditional pdf 's from the estimated 
Fokker-Planck equation. Possible problems in estimating 
drift and diffusion coefficients related with low sampling 
frequencies have been adressed by Sura 19], Ragwitz and 
Kantz H3, [13 and Friedrich et al. |22|. 

The aim of the present letter is to devise an exten- 
sion of the above method in order to overcome problems 
related with the time limit r — > 0. These problems imme- 
diately show up for low sampling rates. We also want to 
point out that for the case of stochastic forces F(t) with 
small but finite temporal correlations the process is not 
markovian in the limit t —> 0. In this case, however, one 
should use the Stratonovich interpretation of stochastic 
processes 0- 



II. DESCRIPTION OF THE METHOD 

The starting point is a first estimate of drift and diffu- 
sion coefficients by the expressions © evaluated for the 
smallest reliably possible values of r. The second step 
is an embedding of drift and diffusion coefficients into a 
family of functions D : (q, a), D 2 (q, a) parameterized by 
a set of free parameters a. The expressions obtained in 
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the first step already yield a crude estimate of the pa- 
rameters a. The third step consists in optimizing the 
free parameters a. 

Optimization of the free parameters can be performed 
in the following way. One determines the conditional 
probability distribution 



p(q,%o,£o;er) 



(3) 



for the parameter set a either by a simulation of the 
Langevin equations or by a numerical solution of the 
corresponding Fokker-Planck equation. In each case, 
one can determine the two point pdf /(q, t; qo, to; a) — 
p(q, £|qo, to; c)/(qo, to) ■ The reader should note that 
this may be done for various finite values of t — to- The 
obtained two time pdf can now be compared with the 
experimental one. A suitable measure for the distance is 
the Kullback-Leibler information 4] defined according to 



K(a,t,t Q ) 



x In 



dq J dq / erp (q,t;q ,<o) 

/exp(q, t; qp, tp) 
f(q,t;q ,t ,a) 



(4) 



The minimum of the Kullback-Leibler information 
with respect to the parameters a yields estimates of drift 
and diffusion of a stochastic process. This process is the 
best approximation with respect to this measure in the 
class of stochastic processes characterized by the param- 
eters a. The problem of identifying a stochastic process 
is then equivalent to determining a minimum of the Kull- 
back information. In practice the minimum can be de- 
termined by gradient or genetic algorithms and solved by 
standard methods [24| • In the following we shall consider 
cases, where it is possible to obtain a parametrization of 
the stochastic processes by only few parameters a such 
that the Kullback-Leibler measure can be investigated by 
graphical means. 



III. EXAMPLES 




FIG. 1: Segment of the one-dimensional synthetic time series 
I. 
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FIG. 2: Fullback distance K(Q, a) as function of the parame- 
ters Q and a for time series I. The lines are equidistant contour 
lines starting from 2.6 • 10 -4 in the center. The distance be- 
tween contour lines is 5 • 10 -5 . A clear minimum is located at 
(Q,a) = (1,1). 



For certain classes of stochastic processes the above 
procedure can be reduced considerably by the fact that 
only few free parameters for the parametrization of drift 
and diffusion terms have to be introduced. As a con- 
sequence the minimization procedure of the Kullback- 
Leibler information is greatly facilitated. 

A. One dimensional systems 

The case of one-dimensional systems allows for the fol- 
lowing treatment due to the fact that the stationary pdf, 
which is assumed to exist, can be determined analyti- 
cally: 



As a consequence, we have the relationship 

D l {q)=D\ q )±\nf{q) + ±D\q) . (6) 
dq dq 

Since f(q) can be determined from the time series an 
estimate in terms of a parameterized ansatz for the dif- 
fusion term suffices. In fact, one may use the ansatz 
D 2 (q) = Q + aq 2 + bq 4 + . . . , which helps in lowering the 
number of parameters a to be estimated by the above 
procedure of minimization the Kullback-Leibler informa- 
tion. The drift then follows from ©. 

Let us consider system I with drift and diffusion func- 
tions 

D 1 (q) = q-q 3 and D 2 {q) = l + q 2 (7) 



(5) 



driven by a multiplicative noise term. We use syn- 
thetic data obtained by numerical integration of the cor- 



3 



responding Langevin equation Q , 

q(t + f)=q(t)+fD 1 [q(t)} + VfD 2 [q(t)}T(t) . (8) 

A time series containing 10 6 points with time increment 
10~ 2 was generated. The intrinsic increment f used 
for numerical integration of the corresponding Langevin 
equation was 1CP 5 . A time segment of the data is pre- 
sented in fig. ^ Since the stochastic process is stationary 
and ergodic all statistical quantities can be retrieved from 
this data. 

For the estimation of the pdf 's from data state space 
has to be divided into bins. We used 100 equidistant 
bins for the stationary pdf. A very accurate way to cal- 
culate the integral yielding the Kullback-Leibler distance 
without running out of memory even for higher dimen- 
sional data is to use an adequate local grid for the first 
argument (the destination) of the conditional pdf 's. The 
conditional pdf then locally can be retrieved from the 
data for any (q, q ) with high accuracy. The local grid 
used in this example covered 20 equidistant bins. 

During the iteration procedure the two point pdf's 
have to be calculated. We again use the numerical sim- 
ulation of Langevin processes as a very efficient way to 
generate these pdf's. 

Starting from the estimates J2J the ansatz 
D 2 (Q,a,q) = Q + aq 2 is reasonable. The drift im- 
mediately follows from © and, for each parameter set 
(Q,a), one obtains a stationary distribution that equals 
the experimental one. Due to this fact the evaluation of 
the conditional pdf p(q,t + T\qo,t;Q,a) suffices to cal- 
culate the Kullback-Leibler distance. A clear minimum 
of the distance is found at (Q,a) = (1, 1) corresponding 
to the original set of parameters. The Kullback distance 
close to this minimum in the two-dimensional parameter 
space is exhibited in fig. El 



B. Application to potential systems 
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FIG. 3: Segment of the two-dimensional synthetic time series 
II. 




FIG. 4: The Kullback distance K(Q) as a function of the noise 
strength Q (time series II). A minimum is clearly visible at 
the value Q — 0.05. 



The procedure for one-dimensional systems can be im- 
mediately extended to higher dimensions if one restricts 
the analysis to the so-called class of potential systems for 
which the drift vector D 1 (q) is obtained from a potential 
K(q) and gik — ^J~Q5ik- The central point of our analysis 
is the following exact expression for the stationary pdf 

/(q) = Ne~ v ^' Q . (9) 

Since the stationary pdf can be estimated from experi- 
mental data one may parameterize the class of stochastic 
processes by the single variable Q. Thus the drift func- 
tion can be taken to be fixed except for the value Q: 

D x (q) =QVln/(q) . (10) 



As an example we consider the two-dimensional system 
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This dynamical system arises as order parameter equa- 
tions for instabilities in nonequilibrium systems and 
has applications from the fields of pattern formation in 
nonequilibrium systems to pattern recognition |lj . It ex- 
hibits the features of multistability and selection. We 
considered the case e = 0.25 and B = 2 (time series II). 
These parameters yield four stable fixpoints of the dy- 
namics on the axes at |q| = 1/2 and unstable fixpoints 
at the origin and on the bisectional lines at |q| = V6/6. 

Data with time increments 10 _1 for the datapoints has 
been generated with a time step 10 -5 for the integration 
of the Langevin equations. The simulated time series II 
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FIG. 5: Time series II: Drift vector field extracted from data 
using the optimal value of Q. Unstable fixpoints in the center 
and on the bisectional line as well as the attractive fixpoints 
are clearly visible. 

with Q = .05 consists of 5 • 10 6 data points. Figure [3] 
exhibits a segment of the generated data. 

We analyzed the time series as outlined above. State 
space in this case is divided in 100 x 100 equidistant bins. 
Since the drift D x (q) can be evaluated from (|10fl all pa- 
rameters are fixed except for the noise strength Q. 

After evaluating the Kullback measure for various val- 
ues of Q this value has to be optimized. The optimal 
value is determined by the minimum of the Kullback dis- 
tance. For the present case the minimum can easily be 
determined by graphical means. 

Fig. 0] shows the Kullback distance K(Q) as a func- 



tion of the noise strength Q for the time series II. The 
minimum is clearly visible at Q — 0.05 and agrees with 
the one used for simulation. With this parameter the 
drift vector field can be recalculated from the stationary 
distribution based on relation (|10fl . The resulting drift 
vector field of dataset II is exhibited in fig. [S] 



IV. CONCLUSION 

Summarizing, we have outlined an operational method 
for the estimation of drift and diffusion terms from ex- 
perimental time series of stochastic Langevin processes. 
In contrast to previous approaches the present algorithm 
does not rely on estimating conditional moments in the 
small time increment limit. Although this limit yields 
a first approximation an iterative refinement of the esti- 
mated stochastic process is performed by minimization 
of the Kullback-Leibler distance between estimated and 
measured two time probability distributions. The pro- 
posed procedure solves the problem of estimating drift 
and diffusion terms of Langevin processes from time se- 
ries. It involves the numerical solution of Langevin equa- 
tions with parameter dependent drift and diffusion terms, 
an evaluation of the Kullback-Leibler integral (which 
may be determined by means of a Monte-Carlo method) 
and an optimization procedure, for which standard ap- 
proaches can be used. All involved steps are based on 
routine calculations. Furthermore, restriction to certain 
classes of stochastic processes like potential systems can 
drastically lower the numerical efforts of the procedure. 
Therefore, the proposed algorithm can be applied also to 
systems with higher dimensional state spaces. 
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